TwigBuffer: Avoiding Useless Intermediate Solutions Completely in Twig Joins

Authors

  • Jiang Li
  • Junhu Wang
Abstract

Twig pattern matching plays a crucial role in XML data processing. TwigStack [2] is a holistic twig join algorithm that solves the problem in two steps: (1) finding potentially useful intermediate path solutions, and (2) merging the intermediate solutions. The algorithm is optimal when the twig pattern has only //-edges, in the sense that no useless partial solutions are generated in the first step (which expedites the second step and boosts overall performance). However, when /-edges are present, a large set of useless partial solutions may be produced, which directly degrades overall performance. Recently, some improved versions of the algorithm (e.g., TwigStackList and iTwigJoin) have been proposed in an attempt to reduce the number of useless partial solutions when /-edges are involved. However, none of these algorithms can avoid useless partial solutions completely. In this paper, we propose a new algorithm, TwigBuffer, that is guaranteed to avoid useless partial solutions completely. Our algorithm is based on an ingenious strategy to buffer and manipulate elements in stacks and lists. Experiments show that TwigBuffer significantly outperforms previous algorithms when arbitrary /-edges are present.
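
The difference between //-edges (ancestor-descendant) and /-edges (parent-child) can be illustrated with the region encoding (start, end, level) that holistic twig joins such as TwigStack typically rely on. The following is only a minimal sketch of the two structural tests, not TwigBuffer or any algorithm from the paper; the names `Node`, `is_ancestor`, `is_parent`, and `match_edge`, as well as the tiny sample document, are hypothetical and chosen for illustration.

```python
from typing import List, NamedTuple, Tuple


class Node(NamedTuple):
    """An XML element with a region encoding (hypothetical helper type)."""
    tag: str
    start: int   # position of the opening tag
    end: int     # position of the closing tag
    level: int   # depth in the document tree (root = 1)


def is_ancestor(a: Node, d: Node) -> bool:
    """//-edge test: d's region lies strictly inside a's region."""
    return a.start < d.start and d.end < a.end


def is_parent(p: Node, c: Node) -> bool:
    """/-edge test: containment plus exactly one level of depth difference."""
    return is_ancestor(p, c) and c.level == p.level + 1


def match_edge(parents: List[Node], children: List[Node],
               child_only: bool) -> List[Tuple[Node, Node]]:
    """Naive nested-loop matching of one query edge (for illustration only)."""
    test = is_parent if child_only else is_ancestor
    return [(p, c) for p in parents for c in children if test(p, c)]


# Sample document (an assumption): <a><x><b/></x><b/></a>
a = Node("a", 1, 8, 1)
b_deep = Node("b", 3, 4, 3)    # nested inside <x>, a grandchild of <a>
b_child = Node("b", 6, 7, 2)   # a direct child of <a>

descendant_matches = match_edge([a], [b_deep, b_child], child_only=False)
child_matches = match_edge([a], [b_deep, b_child], child_only=True)
```

In this sketch the query `a//b` matches both `b` elements, while `a/b` matches only the direct child. An element such as `b_deep` passes the containment test but fails the level test, which gives a rough intuition for why a filter tuned to //-edges can let through partial solutions that a /-edge later rejects.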

Similar resources

Towards unifying advances in twig join algorithms

Twig joins are key building blocks in current XML indexing systems, and numerous algorithms and useful data structures have been introduced. We give a structured, qualitative analysis of recent advances, which leads to the identification of a number of opportunities for further improvements. Cases where combining competing or orthogonal techniques would be advantageous are highlighted, such as ...

QuickStack: A Fast Algorithm for XML Query Matching

With the increasing popularity of XML for data representation and exchange, much research has been done for providing an efficient way to evaluate twig patterns in an XML database. As a result, many holistic join algorithms have been developed, most of which are derivatives of the well-known TwigStack algorithm. However, these algorithms still apply a two phase processing scheme: first identify...

Fast Matching of Twig Patterns

Twig pattern matching plays a crucial role in XML data processing. Existing twig pattern matching algorithms can be classified into two-phase algorithms and one-phase algorithms. While the two-phase algorithms (e.g., TwigStack) suffer from expensive merging cost, the one-phase algorithms (e.g., TwigList, Twig2Stack, HolisticTwigStack) either lack efficient filtering of useless elements, or use o...

Twig Pattern Matching: A Revisit

Twig pattern matching plays a crucial role in XML query processing. In order to reduce the processing time, some existing holistic one-phase twig pattern matching algorithms (e.g., HolisticTwigStack [3], TwigFast [5], etc.) use the core function getNext of TwigStack [2] to effectively and efficiently filter out the useless elements. However, using getNext as a filter may incur other redundant com...

Fast Optimal Twig Joins

In XML search systems twig queries specify predicates on node values and on the structural relationships between nodes, and a key operation is to join individual query node matches into full twig matches. Linear time twig join algorithms exist, but many non-optimal algorithms with better average-case performance have been introduced recently. These use somewhat simpler data structures that are ...

Publication date: 2008